Abstract

We equip the polytope of $n\times n$ Markov matrices with the normalized trace of the Lebesgue measure of $\mathbb{R}^{n^2}$. This probability space provides random Markov matrices, with i.i.d. rows following the Dirichlet distribution of mean $(1/n,\ldots,1/n)$. We show that if $M$ is such a random matrix, then the empirical distribution built from the singular values of $\sqrt{n}\,M$ tends as $n\to\infty$ to a Wigner quarter-circle distribution. Some computer simulations reveal striking asymptotic spectral properties of such random matrices, still waiting for a rigorous mathematical analysis. In particular, we believe that with probability one, the empirical distribution of the complex spectrum of $\sqrt{n}\,M$ tends as $n\to\infty$ to the uniform distribution on the unit disc of the complex plane, and that moreover, the spectral gap of $M$ is of order $1-1/\sqrt{n}$ when $n$ is large.



AMS 2000 Mathematical Subject Classification: 15A52; 15A51; 15A42; 60F15; 62H99. 
Keywords: Random matrices; Markov matrices; Dirichlet laws; Spectral gap. 
1 Introduction

Markov chains constitute an essential tool for the modelling of stochastic phenomena in Biology, Computer Science, Engineering, and Physics. It is nowadays well known that the trend to the equilibrium of ergodic Markov chains is related to the spectral decomposition of their Markov transition matrix, see for instance [Sen06, SC97, CSC08]. The corresponding literature is very rich, and many statisticians, including for instance the famous Persi Diaconis, contributed to this subject by providing quantitative bounds for various concrete specific Markov chains. But how does a Markov chain behave when its Markov transition matrix is taken arbitrarily in the set of Markov matrices? The present paper aims to provide some partial answers to this natural concrete question. From the statistical point of view, one can think about considering random Markov matrices following the "uniform law" over the set of Markov matrices, which corresponds to a maximum entropy distribution or Bayesian prior, see for example [DR06]. Recall that an $n\times n$ square real matrix $M$ is Markov if and only if its entries are non-negative and each row sums up to 1, i.e. if and only if each row of $M$ belongs to the simplex

\[ \Delta_n = \{(x_1,\ldots,x_n)\in[0,1]^n \text{ such that } x_1+\cdots+x_n=1\} \tag{1} \]

which is the portion of the unit $\|\cdot\|_1$-sphere of $\mathbb{R}^n$ with non-negative coordinates. The spectrum of a Markov matrix lies in the unit disc $\{z\in\mathbb{C};\ |z|\le 1\}$, contains 1, and is symmetric with respect to the real axis in the complex plane.

Uniform distribution on Markov matrices 

Let $\mathcal{M}_n$ be the set of $n\times n$ Markov matrices. We need to give a precise meaning to the notion of uniform distribution on $\mathcal{M}_n$. This set is a convex compact polytope with $n(n-1)$ degrees of freedom if $n>1$. It has zero Lebesgue measure in $\mathbb{R}^{n^2}$.

Since $\mathcal{M}_n$ is a polytope of $\mathbb{R}^{n^2}$ (i.e. an intersection of half spaces), the trace of the Lebesgue measure on it makes sense and coincides with a cone measure¹, despite its zero Lebesgue measure in $\mathbb{R}^{n^2}$. Since $\mathcal{M}_n$ is additionally compact, the trace of the Lebesgue measure can be normalized into a probability distribution. We thus define the uniform distribution $\mathcal{U}(\mathcal{M}_n)$ on $\mathcal{M}_n$ as the normalized trace of the Lebesgue measure of $\mathbb{R}^{n^2}$. The following theorem relates $\mathcal{U}(\mathcal{M}_n)$ to the Dirichlet distribution.

Theorem 1.1 (Dirichlet Markov Ensemble). We have $M\sim\mathcal{U}(\mathcal{M}_n)$ if and only if the rows of $M$ are i.i.d. and follow the Dirichlet law of mean $(1/n,\ldots,1/n)$. The probability distribution $\mathcal{U}(\mathcal{M}_n)$ is invariant by permutations of rows and columns.

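Corollary 1.2. If $M\sim\mathcal{U}(\mathcal{M}_n)$ then for every $1\le i,j\le n$, $M_{i,j}\sim\mathrm{Beta}(1,n-1)$, and for every $1\le i,i',j,j'\le n$,

\[
\mathrm{Cov}(M_{i,j},M_{i',j'}) =
\begin{cases}
0 & \text{if } i\ne i' \\
\dfrac{n-1}{n^2(n+1)} & \text{if } i=i' \text{ and } j=j' \\
-\dfrac{1}{n^2(n+1)} & \text{if } i=i' \text{ and } j\ne j'.
\end{cases}
\]

Moreover $M_{i,j}$ and $M_{i',j'}$ are independent if and only if $i\ne i'$.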
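The moments in corollary 1.2 can be checked by a small Monte Carlo experiment. The sketch below is only illustrative: it samples the rows by normalizing i.i.d. unit exponentials (the construction used in theorem 1.5), and the function names `sample_dirichlet_markov` and `entry_moments` are hypothetical helpers introduced here, not part of the paper.

```python
import random

def sample_dirichlet_markov(n, rng):
    """Draw an n x n Markov matrix whose rows are i.i.d. Dirichlet(1, ..., 1)."""
    matrix = []
    for _ in range(n):
        # Normalizing i.i.d. unit exponentials gives a uniform point of the simplex.
        row = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(row)
        matrix.append([x / total for x in row])
    return matrix

def entry_moments(n, samples, seed=0):
    """Monte Carlo estimates of E[M_11], Var(M_11) and Cov(M_11, M_12)."""
    rng = random.Random(seed)
    a, b = [], []
    for _ in range(samples):
        m = sample_dirichlet_markov(n, rng)
        a.append(m[0][0])
        b.append(m[0][1])
    mean_a = sum(a) / samples
    mean_b = sum(b) / samples
    var_a = sum((x - mean_a) ** 2 for x in a) / samples
    cov_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / samples
    return mean_a, var_a, cov_ab

if __name__ == "__main__":
    n = 4
    mean, var, cov = entry_moments(n, 20000)
    print("mean", mean, "expected", 1 / n)
    print("var ", var, "expected", (n - 1) / (n ** 2 * (n + 1)))
    print("cov ", cov, "expected", -1 / (n ** 2 * (n + 1)))
```

The estimates should agree with the closed forms up to Monte Carlo error, which shrinks like the inverse square root of the number of samples.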

The set $\mathcal{M}_n$ is also a compact semi-group for the matrix product. The following two theorems concern the translation invariance of $\mathcal{U}(\mathcal{M}_n)$ and the question of the existence of an idempotent probability distribution on $\mathcal{M}_n$.

Theorem 1.3 (Translation invariance). For every $T\in\mathcal{M}_n$, the law $\mathcal{U}(\mathcal{M}_n)$ is invariant by the left translation $M\mapsto TM$ if and only if $T$ is a permutation matrix. The same holds true for the right translation $M\mapsto MT$.

Theorem 1.4 (Idempotent distributions). There is no probability distribution on $\mathcal{M}_n$, absolutely continuous with respect to $\mathcal{U}(\mathcal{M}_n)$, with full support, and which is invariant by every left translation $M\mapsto TM$ where $T$ runs over $\mathcal{M}_n$. The same holds true for right translations.

The proofs of theorems 1.1, 1.3, 1.4 and corollary 1.2 are given in section 2.

¹ Actually, one can define the trace of the Lebesgue measure and then the uniform distribution on many compact subsets of the Euclidean space, by using the notion of Hausdorff measure [Fal03]. See also [CPSV09] for an approximate simulation method based on billiards and random reflections.
Asymptotic behavior of singular values and eigenvalues

The spectral properties of large dimensional random matrices are connected to many areas of mathematics, see for instance the books [Meh04, HP00, BS06, AGZ09, For09, ER05] and the survey [Bai99]. If $M\sim\mathcal{U}(\mathcal{M}_n)$, then almost surely, the real matrix $M$ is invertible, non-normal, with neither independent nor centered entries. The singular values of certain large dimensional centered random matrices with independent rows are considered for instance in [Aub06] and [MP06, PP07].

For any square $n\times n$ matrix $A$ with real or complex entries, let the complex eigenvalues $\lambda_1(A),\ldots,\lambda_n(A)$ of $A$ be labeled so that $|\lambda_1(A)|\ge\cdots\ge|\lambda_n(A)|$. The spectral radius of $A$ is thus given by $|\lambda_1(A)|=\max_{1\le k\le n}|\lambda_k(A)|$. The empirical spectral distribution (ESD) of $A$ is the discrete probability distribution on $\mathbb{C}$ with at most $n$ atoms defined by

\[ \frac{1}{n}\sum_{k=1}^n\delta_{\lambda_k(A)}. \]

The singular values $s_1(A)\ge\cdots\ge s_n(A)\ge 0$ of $A$ are the eigenvalues of the positive semi-definite Hermitian matrix $(AA^*)^{1/2}$ where $A^*=\bar{A}^\top$ denotes the conjugate transpose of $A$. Namely, for every $1\le k\le n$,

\[ s_k(A)=\lambda_k\bigl(\sqrt{AA^*}\bigr)=\sqrt{\lambda_k(AA^*)}. \]

Note that $AA^*$ and $A^*A$ share the same spectrum. The atoms of the ESD of $\sqrt{AA^*}$ are $s_1(A),\ldots,s_n(A)$. The singular values of $A$ have a clear geometrical interpretation: the linear operator $A$ maps the unit ball to an ellipsoid, and the singular values of $A$ are exactly the half-lengths of its principal axes. In particular, $s_1(A)=\max_{\|x\|_2=1}\|Ax\|_2=\|A\|_{2\to2}$ while $s_n(A)=\min_{\|x\|_2=1}\|Ax\|_2=\|A^{-1}\|_{2\to2}^{-1}$, the last equality holding when $A$ is invertible. Moreover, $A$ has exactly $\mathrm{rank}(A)$ non-zero singular values. The relationship between the eigenvalues and the singular values is captured by the Weyl-Horn inequalities

\[ \forall k\in\{1,\ldots,n\},\quad \prod_{i=1}^k|\lambda_i(A)|\le\prod_{i=1}^k s_i(A)\quad\text{with equality when } k=n, \]

see [Hor54, Wey49]. If $A$ is normal, i.e. $AA^*=A^*A$, then $s_k(A)=|\lambda_k(A)|$ for every $1\le k\le n$. Back to our Dirichlet Markov Ensemble, if $M\sim\mathcal{U}(\mathcal{M}_n)$ then $M$ is almost surely a non-normal matrix, and thus one cannot express the singular values of $M$ in terms of the eigenvalues of $M$. The following theorem gives the asymptotic behavior of the empirical distribution built from the singular values of $M$.
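For a concrete feel for the Weyl-Horn inequalities, the $2\times2$ case can be worked out in closed form. The following sketch uses only the characteristic polynomial and the identity $s_1(A)s_2(A)=|\det A|$; the helper names and the example matrix are choices made here for illustration.

```python
import math

def eigen_moduli_2x2(a, b, c, d):
    """Moduli of the eigenvalues of [[a, b], [c, d]], largest first."""
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        r = math.sqrt(disc)
        moduli = [abs((tr + r) / 2.0), abs((tr - r) / 2.0)]
    else:
        # Complex conjugate pair: both moduli equal sqrt(det), and det > 0 here.
        moduli = [math.sqrt(det), math.sqrt(det)]
    return sorted(moduli, reverse=True)

def singular_values_2x2(a, b, c, d):
    """Singular values of a real 2 x 2 matrix, largest first."""
    t = a * a + b * b + c * c + d * d          # trace of A A^T
    det = a * d - b * c                        # det(A A^T) = det^2
    r = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    s1 = math.sqrt((t + r) / 2.0)
    s2 = abs(det) / s1 if s1 > 0.0 else 0.0    # uses s1 * s2 = |det A|
    return [s1, s2]

if __name__ == "__main__":
    # A 2 x 2 Markov matrix: eigenvalues 1 and 0.7, but it is not normal.
    lam = eigen_moduli_2x2(0.9, 0.1, 0.2, 0.8)
    s = singular_values_2x2(0.9, 0.1, 0.2, 0.8)
    print("eigenvalue moduli:", lam)
    print("singular values:  ", s)
```

On this example the largest singular value is slightly above 1 while the spectral radius is exactly 1, and the two products coincide, as the case $k=n$ of the inequalities requires.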

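Theorem 1.5 (Singular values for Dirichlet Markov Ensemble). Let $(X_{i,j})_{1\le i,j<\infty}$ be an infinite array of i.i.d. exponential random variables of unit mean. For every $n$, let $M$ be the $n\times n$ random matrix defined for every $1\le i,j\le n$ by

\[ M_{i,j}=\frac{X_{i,j}}{X_{i,1}+\cdots+X_{i,n}}. \]

Then $M\sim\mathcal{U}(\mathcal{M}_n)$ and

\[ \mathbb{P}\left(\frac{1}{n}\sum_{k=1}^n\delta_{\lambda_k(nMM^\top)}\xrightarrow[n\to\infty]{w}\mathcal{P}_1\right)=1 \]

where $\xrightarrow{w}$ denotes the weak convergence of probability distributions and $\mathcal{P}_1$ the Marchenko-Pastur distribution defined in table 1. In other words,

\[ \mathbb{P}\left(\frac{1}{n}\sum_{k=1}^n\delta_{s_k(\sqrt{n}M)}\xrightarrow[n\to\infty]{w}\mathcal{Q}_1\right)=1 \]

where $\mathcal{Q}_1$ denotes the Wigner quarter-circle distribution defined in table 1.

  Probability distribution name                               Support                                                      Lebesgue density
  Circle or circular law $\mathcal{C}_\sigma$                 $\{z\in\mathbb{C};\ |z|\le\sigma\}\subset\mathbb{C}$         $z\mapsto(\pi\sigma^2)^{-1}$
  Wigner semi-circle distribution $\mathcal{W}_\sigma$        $[-2\sigma,+2\sigma]\subset\mathbb{R}$                       $x\mapsto(2\pi\sigma^2)^{-1}\sqrt{4\sigma^2-x^2}$
  Wigner quarter-circle distribution $\mathcal{Q}_\sigma$     $[0,2\sigma]\subset\mathbb{R}$                               $x\mapsto(\pi\sigma^2)^{-1}\sqrt{4\sigma^2-x^2}$
  Marchenko-Pastur distribution $\mathcal{P}_\sigma$          $[0,4\sigma^2]\subset\mathbb{R}$                             $x\mapsto(2\pi\sigma^2x)^{-1}\sqrt{x(4\sigma^2-x)}$

Table 1: Some of the remarkable probability distributions in random matrices.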



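Following the notations of table 1, for every real fixed parameter $\sigma>0$, every real random variable $W$, and every complex random variable $Z=U+\sqrt{-1}\,V$ with $U=\mathrm{RealPart}(Z)$ and $V=\mathrm{ImaginaryPart}(Z)$, we have, by a change of variables,

\[ \bigl(W^2\sim\mathcal{P}_\sigma\ \Leftrightarrow\ |W|\sim\mathcal{Q}_\sigma\bigr)\quad\text{and}\quad\bigl(W\sim\mathcal{W}_\sigma\ \Leftrightarrow\ W\overset{d}{=}-W\ \text{and}\ |W|\sim\mathcal{Q}_\sigma\bigr). \]

Moreover, we have, simply by using the Cramér-Wold theorem,

\[ Z\sim\mathcal{C}_{2\sigma}\ \Leftrightarrow\ \bigl(\mathrm{RealPart}(e^{\sqrt{-1}\,\theta}Z)\sim\mathcal{W}_\sigma\ \text{for every } \theta\in[0,2\pi)\bigr). \]

In particular, we have

\[ Z\sim\mathcal{C}_{2\sigma}\ \Rightarrow\ U\sim\mathcal{W}_\sigma\ \text{and}\ V\sim\mathcal{W}_\sigma. \]

Beware however that $U$ and $V$ are not independent random variables! Furthermore, if $\mathbb{P}(|Z|=\sigma;\ V\ge0)=1$ then $Z$ follows the uniform distribution over the upper half circle of radius $\sigma$ if and only if $U$ follows the so-called arc-sine distribution on $[-\sigma,+\sigma]\subset\mathbb{R}$ with Lebesgue density $x\mapsto\bigl(\pi\sqrt{\sigma^2-x^2}\bigr)^{-1}$.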

The proof of theorem 1.5 is given in section 3. Since $|\lambda_1(A)|\le s_1(A)$ for any square matrix $A$, and since $\lambda_1(M)=1$, we have for every $n\ge1$

\[ s_1(M)\ge|\lambda_1(M)|=1. \]

However, theorem 1.5 implies in particular that almost surely

- Cardl 1 ^k such that Sfc(M) > 1 — > 0. 
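The contrast between the largest singular value and the bulk can be observed numerically without a full singular value decomposition. The sketch below is a heuristic check, not a proof: it uses the identity $\sum_k s_k(M)^2=\sum_{i,j}M_{i,j}^2$, estimates $s_1(M)$ by power iteration on $M^\top M$, and compares the remaining mass $\sum_{k\ge2}s_k(M)^2=\frac1n\sum_{k\ge2}s_k(\sqrt{n}M)^2$ with the second moment of $\mathcal{Q}_1$, which equals 1. The helper names and the choice of iteration count are assumptions made for this illustration.

```python
import math
import random

def sample_dirichlet_markov(n, rng):
    """Rows are i.i.d. unit exponentials normalized to sum to one."""
    rows = []
    for _ in range(n):
        x = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(x)
        rows.append([v / total for v in x])
    return rows

def top_singular_value(m, iterations=20):
    """Estimate s_1(M) by power iteration on the symmetric matrix M^T M."""
    n = len(m)
    v = [1.0 / math.sqrt(n)] * n
    s = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in m]          # w = M v
        u = [sum(m[i][j] * w[i] for i in range(n)) for j in range(n)]    # u = M^T w
        norm = math.sqrt(sum(x * x for x in u))
        s = math.sqrt(norm)            # ||M^T M v|| approaches s_1^2 for unit v
        v = [x / norm for x in u]
    return s

def top_and_bulk(n, seed=0):
    """Return s_1(M) and the sum of s_k(M)^2 over k >= 2."""
    m = sample_dirichlet_markov(n, random.Random(seed))
    frobenius_sq = sum(x * x for row in m for x in row)
    s1 = top_singular_value(m)
    return s1, frobenius_sq - s1 * s1

if __name__ == "__main__":
    s1, bulk = top_and_bulk(150)
    print("s_1(M) =", s1)
    print("bulk second moment =", bulk, "(second moment of Q_1 is 1)")
```

For moderate $n$ one should see $s_1(M)$ just above 1 and a bulk second moment close to 1, in line with the theorem; agreement of a single moment is of course only a weak consistency check.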
Random Q-matrices

Bryc, Dembo, and Jiang studied in [BDJ06] the limiting spectral distribution of random Hankel, Markov, and Toeplitz matrices. Let us explain briefly what they mean by "random Markov matrices". They proved the following theorem (see [BDJ06, th. 1.3] and also [BS08]): let $(X_{i,j})_{1\le i<j<\infty}$ be an infinite triangular array of i.i.d. real random variables of mean $0$ and variance $1$. Let $Q$ be the symmetric $n\times n$ random matrix defined for every $1\le i<j\le n$ by $Q_{i,j}=Q_{j,i}=X_{i,j}$, and

\[ Q_{i,i}=-\sum_{k\ne i}Q_{i,k}\quad\text{for every } 1\le i\le n. \]

Then, almost surely, the ESD of $n^{-1/2}Q$ converges as $n\to\infty$ to the free convolution² of a semi-circle law and a standard Gaussian law.

This result gives an answer to a precise question raised by Bai in his 1999 review article [Bai99, sec. 6.1.1]. The matrix $Q$ is not Markov. However, it looks like a Markov generator, i.e. a Q-matrix, since its rows sum up to $0$. Unfortunately, the assumptions do not allow the off-diagonal entries of $Q$ to have non-negative support, and thus $Q$ cannot be almost surely a Markov generator. In particular, if $I$ stands for the identity matrix of size $n\times n$, the symmetric matrix $M=Q+I$ cannot be almost surely Markov.

Eigenvalues and the circular law 

If $M$ is as in theorem 1.5 then $\lambda_1(\sqrt{n}M)=\sqrt{n}$ goes to $+\infty$ as $n\to\infty$ while its weight in the ESD is $1/n$. Thus, it does not contribute to the limiting spectral distribution of $\sqrt{n}M$. Numerical simulations (see figure 1) suggest that the empirical distribution of the rest of the spectrum tends as $n\to\infty$ to the uniform distribution on the unit disc. One can formulate this conjecture as follows.

Conjecture 1.6 (Circle law for the Dirichlet Markov Ensemble). If $M$ is as in theorem 1.5 then

\[ \mathbb{P}\left(\frac{1}{n}\sum_{k=1}^n\delta_{\lambda_k(\sqrt{n}M)}\xrightarrow[n\to\infty]{w}\mathcal{C}_1\right)=1 \]

where $\xrightarrow{w}$ denotes the weak convergence of probability distributions and $\mathcal{C}_1$ the uniform distribution over the unit disc $\{z\in\mathbb{C};\ |z|\le1\}$ as defined in table 1.

The main difficulty in conjecture 1.6 lies in the fact that $M$ is non-normal with non i.i.d. entries. The limiting spectral distribution of non-normal random matrices is a notoriously difficult subject, see for instance [TVK08]. The method used for the singular values in the proof of theorem 1.5 fails for the eigenvalues, due to the lack of variational formulas for the eigenvalues. In contrast to singular values, the eigenvalues of non-normal matrices are very sensitive to perturbations, a phenomenon captured by the notion of pseudo-spectrum [TE05]. The reader may find in [Cha08] a more general version of theorem 3.1 which goes beyond the exponential case, and some partial answers to conjecture 1.6.

² This limiting spectral distribution is a symmetric law on $\mathbb{R}$ with smooth bounded density of unbounded support. See [HP00] or [Bia97] for Voiculescu's free convolution.
Sub-dominant eigenvalue

The fact that non-centered entries produce an explosive extremal eigenvalue was already noticed in various situations, see for instance [And90], [Sil94], [BDJ06, th. 1.4], [BS07], and [Cha07]. It is natural to ask about the asymptotic behavior (convergence and fluctuations) of the sub-dominant eigenvalue $\lambda_2(M)$ when $M\sim\mathcal{U}(\mathcal{M}_n)$. The reader may find some answers in [GN03, GONS00], and may forge new conjectures from our simulations (see figures 2 and 3). For instance, by analogy with the Complex Ginibre Ensemble [Kos92, Rid03], one can state the following:

Conjecture 1.7 (Behavior of sub-dominant eigenvalue and spectral gap). If $M$ is as in theorem 1.5 then $\lambda_1(M)=1$ while

\[ \mathbb{P}\left(\lim_{n\to\infty}\sqrt{n}\,|\lambda_2(M)|=1\right)=1. \]

In particular, the spectral gap $1-|\lambda_2(M)|$ of $M$ is of order $1-1/\sqrt{n}$ for large $n$. Moreover, there exist deterministic sequences $(a_n)$ and $(b_n)$ and a probability distribution $\mathcal{G}$ on $\mathbb{R}$ such that

\[ b_n\bigl(|\lambda_2(M)|-a_n\bigr)\xrightarrow[n\to\infty]{d}\mathcal{G} \]

where $\xrightarrow{d}$ denotes the convergence in law.

There is no clear indication that $\mathcal{G}$ is a Gumbel distribution as for the Complex Ginibre Ensemble. Moreover, our simulations suggest that the sub-dominant eigenvalue is real with positive probability (depending on $n$), which is not surprising
knowing [Ede97, EKS94]. Note that Goldberg and Neumann have shown [GN03] that if $X$ is an $n\times n$ random matrix with i.i.d. rows such that for every $1\le i,j,j'\le n$ with $j\ne j'$,

\[ \mathbb{E}[X_{i,j}]=\frac{1}{n},\quad \mathrm{Var}(X_{i,j})=O\Bigl(\frac{1}{n^2}\Bigr),\quad\text{and}\quad |\mathrm{Cov}(X_{i,j},X_{i,j'})|=O\Bigl(\frac{1}{n^3}\Bigr), \]

then $\mathbb{P}(|\lambda_2(X)|\le r)\ge p$ for any $p\in(0,1)$, any $0<r<1$, and large enough $n$. This is the case if we set $X=M$.

Other distributions 

The Dirichlet distribution of dimension $n$ and mean $(1/n,\ldots,1/n)$ is the uniform distribution on the simplex $\Delta_n$ defined by (1). One can replace the uniform distribution by a Dirichlet distribution of dimension $n$ and arbitrary mean. The argument used in the proof of theorem 1.5 remains the same due to the very similar construction of Dirichlet distributions by projection from i.i.d. Gamma random variables. One can also replace the $\|\cdot\|_1$-norm by any other $\|\cdot\|_p$-norm, and investigate the limiting spectral distribution of the corresponding random matrices. This case can be handled with the construction of the uniform distribution by projection proposed in [SZ90]. Replacing the non-negative portion of spheres by the non-negative portion of balls is also possible by using [BGMN05]. More generally, one can consider random matrices with independent rows. The case of the uniform distribution on the whole unit $\|\cdot\|_p$-ball of $\mathbb{R}^n$ is considered for instance in [Aub06] by using [BGMN05] together with random matrices results for i.i.d. centered entries. It is crucial here to have an explicit construction of the distribution from an i.i.d. array. For the link with the sampling of convex bodies, see [Aub07]. The case of matrices with i.i.d. rows following a log-concave isotropic distribution is considered in the recent work [PP07], by using recently developed results on log-concave measures. The reader may find a universal version of theorem 3.1 in [Cha08], where the exponential law is replaced by an arbitrary law.

Doubly Stochastic matrices 

The Birkhoff or transportation polytope is the set of $n\times n$ doubly stochastic matrices, i.e. matrices which are Markov and have a Markov transpose. Each $n\times n$ doubly stochastic matrix corresponds to a transportation map of $n$ unit masses into $n$ boxes of unit mass (matching), and conversely, each transportation map of this kind is an $n\times n$ doubly stochastic matrix. Geometrically, the Birkhoff polytope is a convex compact subset of $\mathcal{M}_n$ of zero Lebesgue measure in $\mathbb{R}^{n^2}$ and $(n-1)^2$ degrees of freedom if $n>1$. As for $\mathcal{M}_n$, one can define the uniform distribution as the normalized trace of the Lebesgue measure. However, we do not know whether this distribution has a probabilistic representation that allows exact simulation as for $\mathcal{U}(\mathcal{M}_n)$. The spectral behavior of random doubly stochastic matrices was considered in the Physics literature, see for instance [Ber01]. On the purely discrete side, the Birkhoff polytope is also related to magic squares, transportation polytopes and contingency tables, see [DE87, DE85] and [DG95]. Note also that if $M$ is Markov, then $MM^\top$ and $\frac12(M+M^\top)$ are not Markov in general. However, this is the case when $M$ is doubly stochastic. The Birkhoff-von Neumann theorem states that the extremal points of the Birkhoff polytope are exactly the permutation matrices. The reader may find nice spectral results on random uniform permutation matrices in [HKOS00, Wie00] and references therein.

Another interesting polytope of matrices is the set of symmetric $n\times n$ Markov matrices, which is a convex compact polytope of zero Lebesgue measure in $\mathbb{R}^{n^2}$ with $\frac12n(n-1)$ degrees of freedom if $n>1$. As for $\mathcal{M}_n$, one can define the uniform distribution as the normalized trace of the Lebesgue measure. However, we do not know whether this distribution has a probabilistic representation that allows simulation as for $\mathcal{U}(\mathcal{M}_n)$. One can ask about the spectral properties of the corresponding random symmetric Markov matrices. Note that these matrices are doubly stochastic, but the converse is false except when $n=1$ or $n=2$. Our construction of $\mathcal{U}(\mathcal{M}_n)$ in theorem 1.5 corresponds in the Markovian probabilistic jargon to a random conductance model on the complete oriented graph. The study of the spectral properties of random reversible Markov conductance models on the complete non-oriented graph can be found in [Cha09, BCC08, BCC09]. For other graphs, the reader may find some clues in [BDPX05].

Let $M$ be as in theorem 1.5. Numerical simulations suggest that almost surely, the ESD of the symmetric matrix $\frac12(M+M^\top)$ tends, as $n\to\infty$, to a Wigner semi-circle distribution.

If $U$ is an $n\times n$ unitary matrix, then $(|U_{i,j}|^2)_{1\le i,j\le n}$ is a doubly stochastic matrix. These doubly stochastic matrices are called uni-stochastic or unitary-stochastic. There exist doubly stochastic matrices which are not uni-stochastic, see [BEK+05] and [Tan01]. However, every permutation matrix is orthogonal and thus uni-stochastic. The Haar measure on the unitary group induces a probability distribution on the set of uni-stochastic matrices. How about the asymptotic spectral properties of the corresponding random matrices?



Perron-Frobenius eigenvector (invariant vector)

If $M\sim\mathcal{U}(\mathcal{M}_n)$, then almost surely, all the entries of $M$ are non-zero, and in particular, $M$ is almost surely recurrent irreducible and aperiodic. By a theorem of Perron and Frobenius [Sen06], it follows that almost surely, the eigenspace of $M^\top$ associated to the eigenvalue 1 is of dimension 1 and contains a unique vector with non-negative entries and unit $\|\cdot\|_1$-norm. One can ask about the asymptotic behavior of this vector as $n\to\infty$. For a fixed $n$, the distribution of this vector is the distribution of the rows of the infinite product of random matrices $\lim_{k\to\infty}M^k$.
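The convergence of $M^k$ to a matrix with equal rows is easy to observe for a small sample. The sketch below multiplies repeatedly by $M$ in pure Python; the function names and the number of steps are illustrative assumptions, and the approach is only meant for small $n$.

```python
import random

def sample_dirichlet_markov(n, rng):
    """Rows are i.i.d. unit exponentials normalized to sum to one."""
    rows = []
    for _ in range(n):
        x = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(x)
        rows.append([v / total for v in x])
    return rows

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def power_limit(m, steps=500):
    """Approximate lim M^k by multiplying by M a fixed number of times."""
    p = m
    for _ in range(steps):
        p = matmul(p, m)
    return p

def invariant_vector(m, steps=500):
    """Any row of the limit approximates the invariant probability vector."""
    return power_limit(m, steps)[0]

if __name__ == "__main__":
    m = sample_dirichlet_markov(4, random.Random(0))
    for row in power_limit(m):
        print(row)
```

All printed rows should coincide up to rounding, and the common row $\pi$ should satisfy $\pi M=\pi$.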



2 Structure of the Dirichlet Markov Ensemble 

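Let $\Delta_n$ be as in (1). For any $a\in(0,\infty)^n$, the Dirichlet distribution $\mathcal{D}_n(a_1,\ldots,a_n)$, supported by $\Delta_n$, is defined as the distribution of

\[ \frac{1}{\|G\|_1}G=\left(\frac{G_1}{G_1+\cdots+G_n},\ldots,\frac{G_n}{G_1+\cdots+G_n}\right) \]

where $G$ is a random vector of $\mathbb{R}^n$ with independent entries with $G_i\sim\mathrm{Gamma}(1,a_i)$ for every $1\le i\le n$. Here $\mathrm{Gamma}(\lambda,a)$ has density

\[ t\mapsto\frac{\lambda^a}{\Gamma(a)}\,t^{a-1}e^{-\lambda t}\,\mathbf{1}_{(0,\infty)}(t) \]

where $\Gamma(a)=\int_0^\infty t^{a-1}e^{-t}\,dt$ is the Euler Gamma function. Let $P\sim\mathcal{D}_n(a_1,\ldots,a_n)$. For every partition $I_1,\ldots,I_k$ of $\{1,\ldots,n\}$ into $k$ non-empty subsets, we have

\[ \Bigl(\sum_{i\in I_1}P_i,\ldots,\sum_{i\in I_k}P_i\Bigr)\sim\mathcal{D}_k\Bigl(\sum_{i\in I_1}a_i,\ldots,\sum_{i\in I_k}a_i\Bigr). \]

The mean and covariance matrix of $\mathcal{D}_n(a_1,\ldots,a_n)$ are given by

\[ \frac{1}{\|a\|_1}\,a\quad\text{and}\quad\frac{1}{\|a\|_1^2(1+\|a\|_1)}\bigl(\|a\|_1\,\mathrm{diag}(a)-aa^\top\bigr) \]

where $a=(a_1,\ldots,a_n)^\top$ and $\mathrm{diag}(a)$ is the diagonal matrix with diagonal given by $a$. For any non-empty subset $I$ of $\{1,\ldots,n\}$, we have

\[ \sum_{i\in I}P_i\sim\mathrm{Beta}\Bigl(\sum_{i\in I}a_i,\ \sum_{i\notin I}a_i\Bigr) \]

where $\mathrm{Beta}(\alpha,\beta)$ denotes the Euler Beta distribution on $[0,1]$ of Lebesgue density

\[ t\mapsto\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,t^{\alpha-1}(1-t)^{\beta-1}\,\mathbf{1}_{[0,1]}(t). \]

If $P_I=(P_i)_{i\in I}$, $P_{I^c}=(P_i)_{i\notin I}$, $a_I=(a_i)_{i\in I}$, and $|I|=\mathrm{card}(I)$, then

\[ \frac{1}{\sum_{i\in I}P_i}\,P_I\ \text{and}\ P_{I^c}\ \text{are independent and}\quad\frac{1}{\sum_{i\in I}P_i}\,P_I\sim\mathcal{D}_{|I|}(a_I). \]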

For any $a>0$, the Dirichlet distribution $\mathcal{D}_n(a,\ldots,a)$ is exchangeable, with negatively correlated components. More generally, if $P\sim\mu$ where $\mu$ is an exchangeable probability distribution on the simplex $\Delta_n$ with $n>1$, then

\[ 0=\mathrm{Var}(1)=\mathrm{Var}(P_1+\cdots+P_n)=n\,\mathrm{Var}(P_1)+n(n-1)\,\mathrm{Cov}(P_1,P_2). \]

Consequently, $\mathrm{Cov}(P_1,P_2)=-(n-1)^{-1}\mathrm{Var}(P_1)$ and in particular $\mathrm{Cov}(P_1,P_2)\le0$.

We refer for instance to [Wil62] for other properties of Dirichlet distributions. Corollary 1.2 follows immediately from theorem 1.1 together with the basic properties of the Dirichlet distributions mentioned above.
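For the reader's convenience, here is the short computation behind corollary 1.2, written out as a worked example. It only specializes the general Dirichlet formulas of this section to $a=(1,\ldots,1)$, so that $\|a\|_1=n$; the symbol $\mathbf{1}\mathbf{1}^\top$ for the all-ones matrix is notation introduced here.

```latex
\[
\text{mean}=\frac{1}{n}(1,\ldots,1)^\top,
\qquad
\text{covariance}=\frac{1}{n^2(n+1)}\bigl(n\,I-\mathbf{1}\mathbf{1}^\top\bigr),
\]
\[
\mathrm{Var}(P_1)=\frac{n-1}{n^2(n+1)},
\qquad
\mathrm{Cov}(P_1,P_2)=-\frac{1}{n^2(n+1)}=-\frac{\mathrm{Var}(P_1)}{n-1},
\]
\[
P_1\sim\mathrm{Beta}\Bigl(a_1,\sum_{i\neq 1}a_i\Bigr)=\mathrm{Beta}(1,n-1).
\]
```

The second line agrees with the exchangeability identity above, and the last line is the aggregation property applied to $I=\{1\}$.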

Proof of theorem 1.1. As a subset of $\mathbb{R}^n$, the simplex $\Delta_n$ defined by (1) is of zero Lebesgue measure. However, by considering $\Delta_n$ as a convex subset of the hyper-plane of equation $x_1+\cdots+x_n=1$, or by using the general notion of Hausdorff measure, one can see that in fact the Dirichlet distribution $\mathcal{D}_n(1,\ldots,1)$ is the normalized trace of the Lebesgue measure of $\mathbb{R}^n$ on the simplex $\Delta_n$. In other words, $\mathcal{D}_n(1,\ldots,1)$ can be seen as the uniform distribution on $\Delta_n$, see [SZ90].

We identify $\mathcal{M}_n$ with $(\Delta_n)^n=\Delta_n\times\cdots\times\Delta_n$ where $\Delta_n$ is repeated $n$ times. The trace of the Lebesgue measure of $\mathbb{R}^{n^2}=(\mathbb{R}^n)^n$ on $(\Delta_n)^n$ is the $n$-tensor product of the trace of the Lebesgue measure of $\mathbb{R}^n$ on $\Delta_n$, which after normalization is the $n$-tensor product measure $\mathcal{D}_n(1,\ldots,1)^{\otimes n}$. Consequently, for every positive integer $n$,

\[ \mathcal{U}(\mathcal{M}_n)=\mathcal{D}_n(1,\ldots,1)^{\otimes n}. \]

This gives the invariance of $\mathcal{U}(\mathcal{M}_n)$ by permutation of rows. If $M\sim\mathcal{U}(\mathcal{M}_n)$ then the rows of $M$ are i.i.d. and follow the Dirichlet distribution $\mathcal{D}_n(1,\ldots,1)$. Finally, the invariance of $\mathcal{U}(\mathcal{M}_n)$ by permutation of columns comes from the exchangeability of the Dirichlet distribution $\mathcal{D}_n(1,\ldots,1)$. □

Recursive simulation

The simulation of $\mathcal{U}(\mathcal{M}_n)$ follows from the simulation of $n$ i.i.d. realizations of $\mathcal{D}_n(1,\ldots,1)$ by using i.i.d. exponential random variables. The elements of Dyson's classical Gaussian ensembles GUE and GOE can be simulated recursively by adding a new independent line/column. It is natural to ask about a recursive method for the Dirichlet Markov Ensemble. If

\[ X\sim\mathcal{D}_{n-1}(a_2,\ldots,a_n)\quad\text{and}\quad Y\sim\mathrm{Beta}(a_1,a_2+\cdots+a_n) \]

are independent, then

\[ (Y,(1-Y)X)\sim\mathcal{D}_n(a_1,\ldots,a_n). \]

This recursive simulation of Dirichlet distributions is known as the stick-breaking algorithm [Set94]. It allows one to simulate $\mathcal{U}(\mathcal{M}_n)$ recursively on $n$. Namely, if $M$ is such that $M\sim\mathcal{U}(\mathcal{M}_n)$, then

\[ \begin{pmatrix} Y & (1-Y)\cdot M \\ \multicolumn{2}{c}{Z} \end{pmatrix}\sim\mathcal{U}(\mathcal{M}_{n+1}) \]

where $Z$ is a random row vector of $\mathbb{R}^{n+1}$ with $Z\sim\mathcal{D}_{n+1}(1,\ldots,1)$ and $Y$ is a random column vector of $\mathbb{R}^n$ with i.i.d. entries of law $\mathrm{Beta}(1,n)$, with $M,Y,Z$ independent. Here $((1-Y)\cdot M)_{i,j}:=(1-Y_i)M_{i,j}$ for every $1\le i,j\le n$.
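The stick-breaking recursion translates directly into code. The following sketch is one possible reading of it, using `random.betavariate` from the Python standard library; the function names are hypothetical, and appending the fresh Dirichlet row $Z$ as the last row is a choice made here (by row exchangeability its position should not matter).

```python
import random

def stick_breaking_dirichlet(a, rng):
    """Sample from Dirichlet(a[0], ..., a[-1]) by recursive stick-breaking."""
    if len(a) == 1:
        return [1.0]                     # the one-point simplex
    y = rng.betavariate(a[0], sum(a[1:]))
    x = stick_breaking_dirichlet(a[1:], rng)
    return [y] + [(1.0 - y) * v for v in x]

def grow_markov_matrix(m, rng):
    """From an n x n sample, build an (n+1) x (n+1) sample as described above."""
    n = len(m)
    grown = []
    for row in m:
        y = rng.betavariate(1.0, n)      # Y_i ~ Beta(1, n), independent of the row
        grown.append([y] + [(1.0 - y) * v for v in row])
    # New independent row Z ~ Dirichlet(1, ..., 1) of dimension n + 1.
    grown.append(stick_breaking_dirichlet([1.0] * (n + 1), rng))
    return grown

if __name__ == "__main__":
    rng = random.Random(0)
    m = [[1.0]]                          # the unique 1 x 1 Markov matrix
    for _ in range(3):
        m = grow_markov_matrix(m, rng)
    for row in m:
        print(row, sum(row))
```

Each step keeps every row on the simplex, so the output is a Markov matrix at every size.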

Asymptotic behavior of the rows 

Let $M$ and $(X_{i,j})_{1\le i,j<\infty}$ be as in theorem 1.5. Let us fix $k\ge1$ and $n\ge i\ge1$. The $k$-th moment $m_{n,i,k}$ of the discrete probability distribution $\frac1n\sum_{j=1}^n\delta_{nM_{i,j}}$ is given by

\[ m_{n,i,k}=\frac{1}{n}\sum_{j=1}^n\left(\frac{nX_{i,j}}{X_{i,1}+\cdots+X_{i,n}}\right)^k=\frac{\frac1n\bigl(X_{i,1}^k+\cdots+X_{i,n}^k\bigr)}{\bigl(\frac1n(X_{i,1}+\cdots+X_{i,n})\bigr)^k}. \]

Therefore, by using twice the strong law of large numbers, we get that almost surely,

\[ \lim_{n\to\infty}m_{n,i,k}=\frac{\mathbb{E}[X_{1,1}^k]}{\mathbb{E}[X_{1,1}]^k}=\mathbb{E}[X_{1,1}^k]. \]

As a consequence, almost surely, for any fixed $i\ge1$ and every $k\ge1$,

\[ \lim_{n\to\infty}W_k\Bigl(\frac{1}{n}\sum_{j=1}^n\delta_{nM_{i,j}},\ \mathcal{E}_1\Bigr)=0 \]

where $\mathcal{E}_1=\mathcal{L}(X_{1,1})$ is the exponential law of unit mean and where $W_k(\cdot,\cdot)$ is the so-called Wasserstein-Mallows coupling distance of order $k$ (see for instance [Vil03] or [Rac91]). This result is a special case of a more general well known phenomenon (sometimes referred to as the Poincaré observation) concerning the coordinates of a uniformly distributed random point on the unit $\|\cdot\|_p$-sphere of $\mathbb{R}^n$ with $1\le p<\infty$ when $n\to\infty$, see for instance [NR03], [Jia09], and references therein.
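The moment convergence above can be observed on a single long row. The sketch below computes $m_{n,i,k}$ for one simulated row and compares it with $\mathbb{E}[X_{1,1}^k]=k!$, the $k$-th moment of the unit exponential law; the function name `scaled_row_moment` is a hypothetical helper.

```python
import math
import random

def scaled_row_moment(n, k, rng):
    """k-th moment of the empirical law of (n * M_{i,j})_j for one random row."""
    x = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(x)
    return sum((n * v / total) ** k for v in x) / n

if __name__ == "__main__":
    rng = random.Random(0)
    n = 200000
    for k in (1, 2, 3):
        print(k, scaled_row_moment(n, k, rng), "limit", math.factorial(k))
```

The case $k=1$ is exact for every $n$, since each row sums to one; higher moments converge more slowly because the variance of $X^k$ grows quickly with $k$.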
Semi-group structure and translation invariance

The set $\mathcal{M}_n$ is a semi-group for the usual matrix product. In particular, for every $T\in\mathcal{M}_n$, the set $\mathcal{M}_n$ is stable by the left translation $M\mapsto TM$ and the right translation $M\mapsto MT$. When $T$ is a permutation matrix, these translations are bijective maps, and the left (respectively right) translation corresponds to a permutation of rows (respectively columns).

For some fixed $T\in\mathcal{M}_n$, let us consider the left translation $M\mapsto TM$, where $M\sim\mathcal{U}(\mathcal{M}_n)$. By linearity, we have

\[ \mathbb{E}[TM]=T\,\mathbb{E}[M]=T\,\frac{1}{n}\mathbf{1}=\frac{1}{n}\mathbf{1} \]

where $\mathbf{1}$ is the $n\times n$ matrix full of ones. Thus, the left translation by $T$ leaves the mean invariant.

Proof of theorem 1.3. First of all, the case $n=1$ is trivial and one can assume that $n>1$ in the rest of the proof. A probability distribution $\mu$ on $\mathcal{M}_n$ is invariant by the left translation $M\mapsto PM$ for every permutation matrix $P$ of size $n\times n$ if and only if $\mu$ is row exchangeable. Similarly, $\mu$ is invariant by the right translation $M\mapsto MP$ for every permutation matrix $P$ of size $n\times n$ if and only if $\mu$ is column exchangeable. Theorem 1.1 gives then the invariance of $\mathcal{U}(\mathcal{M}_n)$ by left and right translations with respect to permutation matrices³.

³ Note however that as a law over $\mathbb{R}^{n^2}$, $\mathcal{U}(\mathcal{M}_n)$ is not exchangeable. The permutations of rows and columns correspond to a proper subset of the group of permutations of the entries.

Conversely, let us assume that the law $\mathcal{U}(\mathcal{M}_n)$ is invariant by the left translation $M\mapsto TM$ for some $T\in\mathcal{M}_n$. If $M\sim\mathcal{U}(\mathcal{M}_n)$, and since the components of the first column $M_{\cdot,1}$ of $M$ are i.i.d., we have

\[ \mathrm{Var}((TM)_{1,1})=\mathrm{Var}\Bigl(\sum_{k=1}^nT_{1,k}M_{k,1}\Bigr)=\sum_{k=1}^n(T_{1,k})^2\,\mathrm{Var}(M_{k,1})=\mathrm{Var}(M_{1,1})\sum_{k=1}^n(T_{1,k})^2. \]

The invariance hypothesis implies in particular that $\mathrm{Var}(M_{1,1})=\mathrm{Var}((TM)_{1,1})$. Since $\mathrm{Var}(M_{1,1})=(n-1)/(n^2(n+1))>0$, we get $1=\sum_{k=1}^n(T_{1,k})^2$. Now, $T$ is Markov and thus $\sum_{k=1}^nT_{1,k}=1$, which gives

\[ \sum_{k=1}^n\bigl(T_{1,k}-(T_{1,k})^2\bigr)=0. \]

Since $T$ is Markov, its entries are in $[0,1]$ and hence $T_{1,k}\in\{0,1\}$ for every $1\le k\le n$. The condition $\sum_{k=1}^nT_{1,k}=1$ gives then that the first line of $T$ is an element of the canonical basis of $\mathbb{R}^n$. The same argument used for $(TM)_{k,1}$ for every $1\le k\le n$ shows that every line of $T$ is an element of the canonical basis, and thus $T$ is a binary matrix with exactly one 1 on each row. Since $TM\sim\mathcal{U}(\mathcal{M}_n)$, it has independent rows, and thus the positions of the 1's on the rows of $T$ are pairwise different, which means that $T$ is a permutation matrix as expected.

Let us consider now the case where the law $\mathcal{U}(\mathcal{M}_n)$ is invariant by the right translation $M\mapsto MT$ for some $T\in\mathcal{M}_n$. If $M\sim\mathcal{U}(\mathcal{M}_n)$, we can first take a look at the mean. Namely, $\mathbb{E}[MT]=\mathbb{E}[M]\,T=\frac1nS$ where $S$ is defined by

\[ S_{i,j}=\sum_{k=1}^nT_{k,j} \]

for every $1\le i,j\le n$. Now, the invariance hypothesis gives on the other hand

\[ \mathbb{E}[MT]=\mathbb{E}[M]=\frac{1}{n}\mathbf{1} \]

and thus $S=\mathbf{1}$, which means that $T$ is doubly stochastic, i.e. both $T$ and $T^\top$ are Markov. The invariance hypothesis implies also that

\[ \mathrm{Var}((MT)_{1,1})=\mathrm{Var}(M_{1,1})=\frac{n-1}{n^2(n+1)}. \]

But since the first line $M_{1,\cdot}$ of $M$ is $\mathcal{D}_n(1,\ldots,1)$ distributed,

\[ \mathrm{Var}((MT)_{1,1})=\sum_{1\le i,j\le n}T_{i,1}T_{j,1}\,\mathrm{Cov}(M_{1,i},M_{1,j})=\frac{n-1}{n^2(n+1)}\sum_{i=1}^n(T_{i,1})^2-\frac{2}{n^2(n+1)}\sum_{1\le i<j\le n}T_{i,1}T_{j,1}. \]

Since $T$ is doubly stochastic, we have $1=\sum_{i=1}^nT_{i,1}$ and thus

\[ (n-1)\sum_{i=1}^n\bigl(T_{i,1}-(T_{i,1})^2\bigr)=-2\sum_{1\le i<j\le n}T_{i,1}T_{j,1}. \]

The terms of the left and right hand side have opposite signs, which gives that $T_{i,1}\in\{0,1\}$ for every $1\le i\le n$. The same method used for $(MT)_{1,k}$ for every $1\le k\le n$ shows that $T$ is a binary matrix. Since $T$ is doubly stochastic, it follows that $T$ is actually a permutation matrix, as expected. □
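The key quantity in the left-translation part of the proof is the factor $\sum_k(T_{1,k})^2$, which multiplies the variance of an entry. The tiny sketch below evaluates it for a few Markov rows; the function name is a hypothetical helper introduced here.

```python
def left_translation_variance_factor(t_row):
    """Sum of squares of a Markov row, i.e. Var((TM)_{1,1}) / Var(M_{1,1})."""
    if any(v < 0.0 for v in t_row) or abs(sum(t_row) - 1.0) > 1e-12:
        raise ValueError("expected a row of a Markov matrix")
    return sum(v * v for v in t_row)

if __name__ == "__main__":
    print(left_translation_variance_factor([0.0, 1.0, 0.0]))   # canonical basis row
    print(left_translation_variance_factor([0.5, 0.5, 0.0]))   # averages two rows
    print(left_translation_variance_factor([0.25] * 4))        # uniform row
```

Only canonical basis rows give the factor 1; any genuine averaging of rows strictly shrinks the variance, which is why invariance forces $T$ to be binary.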

The set of $n\times n$ permutation matrices is a discrete subgroup of the orthogonal group of $\mathbb{R}^n$, isomorphic to the symmetric group $S_n$. The group of permutation matrices plays for the Dirichlet Markov Ensemble the role played by the orthogonal group for Dyson's GOE or COE, and the role played by the unitary group for Dyson's GUE or CUE. In some sense, we replaced an $L^2$ Gaussian structure by an $L^1$ Dirichlet structure while maintaining the permutation invariance.

A very natural question is to ask about the existence of a convolution idempotent probability distribution on the compact semi-group $\mathcal{M}_n$. Recall that a probability distribution $\mu$ on a semi-group $\mathfrak{S}$ is idempotent if and only if $\mu*\mu=\mu$. Here the convolution $\mu*\nu$ of two probability distributions $\mu$ and $\nu$ on $\mathfrak{S}$ is defined, for every bounded continuous $f:\mathfrak{S}\to\mathbb{R}$, by

\[ \int_{\mathfrak{S}}f(s)\,d(\mu*\nu)(s)=\int_{\mathfrak{S}}\left(\int_{\mathfrak{S}}f(s_ls_r)\,d\mu(s_l)\right)d\nu(s_r). \]

Actually, the structure of compact semi-groups and their idempotent measures was deeply investigated in the 1960's, see [Ros71, p. 158-160] for a historical account. In particular, one can find in [Ros71, lem. 3] the following result.

Lemma 2.1. Let $\mu$ be a regular probability distribution over a compact Hausdorff semi-group $\mathfrak{S}$ such that the support of $\mu$ generates $\mathfrak{S}$. Then the mass of the convolution sequence $\mu^{*n}$ concentrates on the kernel $K(\mathfrak{S})$ of $\mathfrak{S}$. More precisely, for every open set $O$ containing $K(\mathfrak{S})$ and every $\varepsilon>0$, there exists a positive integer $n_\varepsilon$ such that $\mu^{*n}(O)>1-\varepsilon$ for every $n\ge n_\varepsilon$.

Here $\mu^{*n}$ denotes the convolution product $\mu*\cdots*\mu$ of $n$ copies of $\mu$. If $\mu^{*n}$ tends to $\mu$ as $n\to\infty$ then $\mu$ is convolution idempotent, that is $\mu*\mu=\mu$. The kernel $K(\mathfrak{S})$ of $\mathfrak{S}$ is the sub-semi-group of $\mathfrak{S}$ obtained by taking the intersection of the family of two sided ideals of $\mathfrak{S}$, see [Ros71, th. 1]. A direct consequence of lemma 2.1 is the absence of a translation invariant probability measure $\mu$ on $\mathfrak{S}$ with full support such that the kernel of $\mathfrak{S}$ is a $\mu$-proper sub-semi-group of $\mathfrak{S}$. By $\mu$-proper sub-semi-group here we mean that its $\mu$-measure is $<1$. This result can be easily understood intuitively since the translation associated to a non-invertible element of $\mathfrak{S}$ gives a strict contraction of the support.



Proof of theorem 1.4. The kernel of the semi-group $\mathcal{M}_n$ is constituted by the $n\times n$ Markov matrices with equal rows, which are the $n\times n$ idempotent Markov matrices (i.e. $M^2=M$). The reader may find more details in [Ros71, p. 146]. Since the kernel of $\mathcal{M}_n$ is a $\mathcal{U}(\mathcal{M}_n)$-proper sub-semi-group of $\mathcal{M}_n$, lemma 2.1 implies the absence of any convolution idempotent probability distribution on $\mathcal{M}_n$, absolutely continuous with respect to $\mathcal{U}(\mathcal{M}_n)$ and with full support. The proof is finished by noticing that if a probability distribution on $\mathcal{M}_n$ is invariant by every left (or right) translation, then it is convolution idempotent. Note by the way that the Wedderburn matrix $\frac1n\mathbf{1}$ belongs to the kernel of $\mathcal{M}_n$, and also that this kernel is equal to $\{\lim_{k\to\infty}M^k;\ M\in\mathcal{A}_n\}$ where $\mathcal{A}_n$ is the collection of irreducible aperiodic elements of $\mathcal{M}_n$. The reader may find in [Ros71, ch. 5] the structure of non fully supported idempotent probability distributions on compact semi-groups and in particular on $\mathcal{M}_n$. □



3 Proof of theorem 1.5

The following theorem can be found for instance in [BS06, th. 3.6].

Theorem 3.1 (Singular values of large dimensional non-centered random arrays). Let $(X_{i,j})_{1\le i,j<\infty}$ be an infinite array of i.i.d. real random variables with mean $m$ and variance $\sigma^2\in(0,\infty)$. If $X=(X_{i,j})_{1\le i,j\le n}$, then

\[ \mathbb{P}\left(\frac{1}{n}\sum_{k=1}^n\delta_{s_k(n^{-1/2}X)}\xrightarrow[n\to\infty]{w}\mathcal{Q}_\sigma\right)=1 \]

where $\xrightarrow{w}$ denotes the weak convergence of probability distributions and $\mathcal{Q}_\sigma$ is the Wigner quarter-circle distribution defined in table 1. Moreover,

\[ \mathbb{P}\Bigl(\lim_{n\to\infty}s_1(n^{-1/2}X)=2\sigma\Bigr)=1\quad\text{if and only if}\quad\mathbb{E}[X_{1,1}]=0\ \text{and}\ \mathbb{E}[|X_{1,1}|^4]<\infty. \]

The following lemma is a consequence of [BY93, le. 2] (see also [BS06, le. 5.13]).



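Lemma 3.2 (Uniform law of large numbers). If $(X_{i,j})_{1\le i,j<\infty}$ is an infinite array of i.i.d. random variables of mean $m$, then by denoting $S_{i,n}=\sum_{j=1}^nX_{i,j}$,

\[ \max_{1\le i\le n}\left|\frac{S_{i,n}}{n}-m\right|\xrightarrow[n\to\infty]{\text{a.s.}}0 \]

and in the case where $m\ne0$, we have also

\[ \max_{1\le i\le n}\left|\frac{n}{S_{i,n}}-\frac{1}{m}\right|\xrightarrow[n\to\infty]{\text{a.s.}}0. \]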



The following lemma is a consequence of the Courant-Fischer variational formulas
for singular values, see [HJ90]. We leave the proof to the reader.

Lemma 3.3 (Singular values of diagonal multiplicative perturbations). For every
$n \times n$ matrix $A$, every $n \times n$ diagonal matrix $D$, and every $1 \le k \le n$,

\[ s_n(D)\, s_k(A) \le s_k(DA) \le s_1(D)\, s_k(A). \]
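Since the proof of lemma 3.3 is left to the reader, a quick numerical sanity check may be useful. The sketch below is only an illustration under assumed choices (a Gaussian matrix A, diagonal entries drawn uniformly in [0.5, 2], size n = 6, and a small tolerance for rounding); it is not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6  # illustrative size (assumption)
A = rng.standard_normal((n, n))
d = rng.uniform(0.5, 2.0, size=n)  # diagonal entries of D (assumption)
D = np.diag(d)

# np.linalg.svd returns the singular values in non-increasing order,
# i.e. s_1 >= s_2 >= ... >= s_n, matching the ordering used in the lemma.
s_A = np.linalg.svd(A, compute_uv=False)
s_DA = np.linalg.svd(D @ A, compute_uv=False)
# The singular values of a diagonal matrix are the moduli of its entries.
s_D = np.sort(np.abs(d))[::-1]

lower = s_D[-1] * s_A  # s_n(D) s_k(A) for k = 1, ..., n
upper = s_D[0] * s_A   # s_1(D) s_k(A) for k = 1, ..., n
tol = 1e-12            # guards against floating point rounding
bounds_hold = bool(np.all(lower <= s_DA + tol) and np.all(s_DA <= upper + tol))
print(bounds_hold)
```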

We are now able to prove theorem 1.5.

Proof of theorem 1.5. We have $M = DE$ where $E = (X_{i,j})_{1 \le i,j \le n}$ and $D$ is the $n \times n$
diagonal matrix given for every $1 \le i \le n$ by

\[ D_{i,i} = \frac{1}{\sum_{j=1}^{n} X_{i,j}}. \]



The fact that $M \sim \mathcal{U}(\mathcal{M}_n)$ follows immediately from theorem 1.1 combined with the
construction of the Dirichlet distribution $\mathcal{D}_n(1, \ldots, 1)$ from i.i.d. exponential random
variables. It remains to prove the convergence of the ESD of $\sqrt{n M M^\top}$ as $n \to \infty$ to
the Wigner quarter-circle distribution $Q_1$. To this end, we use the method of Aubrun
[Aub06], replacing the unit $\|\cdot\|_p$-ball by the portion of the unit $\|\cdot\|_1$-sphere with
non-negative coordinates. It suffices to show that almost surely, the discrete measure
$\frac{1}{n} \sum_{k=2}^{n} \delta_{s_k(\sqrt{n} M)}$ tends weakly to the Wigner quarter-circle distribution $Q_1$.

We first observe that $E$ is a rank one additive perturbation of the centered
random matrix $E - \mathbb{E}E$. Also, a standard interlacing inequality gives

\[ s_2(E) \le s_1(E - \mathbb{E}E). \]

Now by the second part of theorem 3.1, we have $s_1(E - \mathbb{E}E) = O(\sqrt{n})$ almost surely.
Consequently, $s_2(n^{-1/2} E) = O(1)$ almost surely. In particular, almost surely, the
sequence $\left( \frac{1}{n} \sum_{k=2}^{n} \delta_{s_k(n^{-1/2} E)} \right)_{n \ge 1}$ remains in a compact set. The desired result then
follows from the combination of the first part of theorem 3.1 with lemmas 3.3 and 3.2.
This proof does not rely on the exponential nature of the $X_{i,j}$'s and actually remains
valid for more general laws, see [Cha08]. □
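The statement of theorem 1.5 can be explored by simulation. The sketch below is an illustration, not the author's code. It builds M from i.i.d. exponential variables as in the proof, drops the largest singular value of sqrt(n) M, and measures the Kolmogorov distance between the remaining empirical distribution and the quarter-circle law. The normalization of Q_1 used here (density sqrt(4 - x^2)/pi on [0, 2]) is an assumption consistent with the limit 2σ in theorem 3.1; the function names, n = 300 and the seed are also assumptions.

```python
import numpy as np

def quarter_circle_cdf(x):
    """CDF of the quarter-circle law with density sqrt(4 - x^2) / pi on [0, 2]."""
    x = np.clip(x, 0.0, 2.0)
    return (0.5 * x * np.sqrt(4.0 - x**2) + 2.0 * np.arcsin(0.5 * x)) / np.pi

def dirichlet_markov(n, rng):
    """Random Markov matrix M = D E built from i.i.d. exponential entries."""
    E = rng.exponential(1.0, size=(n, n))
    return E / E.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
n = 300  # illustrative size (assumption)
M = dirichlet_markov(n, rng)

# Singular values of sqrt(n) M, returned in non-increasing order.
s = np.linalg.svd(np.sqrt(n) * M, compute_uv=False)
bulk = np.sort(s[1:])  # drop s_1, as in the measure with k = 2, ..., n
m = bulk.size

# Kolmogorov distance between the empirical CDF of the bulk and the limit CDF.
F = quarter_circle_cdf(bulk)
ks_distance = max(np.max(np.abs(F - np.arange(1, m + 1) / m)),
                  np.max(np.abs(F - np.arange(0, m) / m)))
print(s[0], ks_distance)
```

The largest singular value is at least sqrt(n), because 1 is an eigenvalue of M, while the Kolmogorov distance is expected to be small for large n.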
There is no equivalent of lemma 3.3 for the eigenvalues instead of the singular
values, and thus the method used to prove theorem 1.5 fails for conjecture 1.6. Note
that by lemma 3.2 used with the exponential distribution of mean $m = 1$,

\[ \|nD - I\|_{2 \to 2} = \max_{1 \le i \le n} \left| \frac{n}{S_{i,n}} - 1 \right| \xrightarrow[n \to \infty]{\text{a.s.}} 0. \]
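As a rough numerical illustration of this convergence (a sketch under assumed sizes and seed, with a hypothetical helper name), one can compute the operator norm of nD − I, which for a diagonal matrix is the largest modulus of its diagonal entries, for a few increasing values of n.

```python
import numpy as np

rng = np.random.default_rng(3)

def norm_nD_minus_I(n, rng):
    """Operator norm of nD - I, where D_ii = 1 / S_{i,n} and the X_ij are Exp(1)."""
    X = rng.exponential(1.0, size=(n, n))
    S = X.sum(axis=1)  # row sums S_{i,n}
    # nD - I is diagonal, so its norm is the largest |n / S_{i,n} - 1|.
    return np.max(np.abs(n / S - 1.0))

sizes = [100, 400, 1600]  # illustrative sizes (assumption)
deviations = [norm_nD_minus_I(n, rng) for n in sizes]
for n, dev in zip(sizes, deviations):
    print(n, dev)
```

The decay is slow, which is consistent with a maximum taken over n rows whose normalized sums each fluctuate at scale n^{-1/2}.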



If $A$ is diagonal, then we simply have $\|A\|_{2 \to 2} = s_1(A) = \max_{1 \le k \le n} |A_{k,k}|$, and when
$A$ is diagonal and invertible, $\|A^{-1}\|_{2 \to 2}^{-1} = s_n(A) = \min_{1 \le k \le n} |A_{k,k}|$. Now, by the
circular law theorem for non-central random matrices [Cha07], we get that almost
surely, the ESD of $n^{-1/2} E$ converges, as $n \to \infty$, to the uniform distribution $C_1$ (see
table 1). It is then natural to decompose $\sqrt{n} M$ as

\[ \sqrt{n} M = nD\, n^{-1/2} E = (nD - I)\, n^{-1/2} E + n^{-1/2} E. \]

Unfortunately, since $m = 1 \neq 0$, we have almost surely (see [Cha07])

\[ \|n^{-1/2} E\|_{2 \to 2} = s_1(n^{-1/2} E) \xrightarrow[n \to \infty]{} +\infty. \]

This suggests that $\sqrt{n} M$ cannot be seen as a perturbation of $n^{-1/2} E$ by a matrix
of small norm. Actually, even if it were the case, the relation between the two spectra
is unknown since $E$ is not normal. One can think about using logarithmic potentials
to circumvent the problem. The strength of the logarithmic potential approach is
that it allows one to study the asymptotic behavior of the ESD (i.e. eigenvalues) of
non-normal matrices via the singular values of a family of matrices indexed by $z \in \mathbb{C}$.
The details are given in [Cha07] for instance. The logarithmic potential of the ESD
of $\sqrt{n} M$ at point $z$ is

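\[ U_n(z) = -\frac{1}{n} \log\left|\det(\sqrt{n} M - zI)\right|
          = -\frac{1}{n} \log\left|\det(nD)\right| - \frac{1}{n} \log\left|\det\left(n^{-1/2} E - z (nD)^{-1}\right)\right|. \]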
Now, by lemma 3.2,

\[ \frac{1}{n} \log\left|\det(nD)\right| \xrightarrow[n \to \infty]{\text{a.s.}} 0. \]
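The factorization of the logarithmic potential, which rests on $\sqrt{n} M - zI = nD\,(n^{-1/2} E - z (nD)^{-1})$, and the smallness of the term involving nD can be checked numerically. The sketch below is illustrative only: the point z = 0.3 + 0.4i, the size n = 200, the seed and the helper name `log_abs_det` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200          # illustrative size (assumption)
z = 0.3 + 0.4j   # illustrative point of the complex plane (assumption)

E = rng.exponential(1.0, size=(n, n))
d = 1.0 / E.sum(axis=1)  # diagonal entries of D
M = d[:, None] * E       # M = D E
I = np.eye(n)

def log_abs_det(B):
    """log|det(B)| computed with slogdet, which avoids overflow of det."""
    return np.linalg.slogdet(B)[1]

# Left-hand side: U_n(z) = -(1/n) log|det(sqrt(n) M - z I)|.
lhs = -log_abs_det(np.sqrt(n) * M - z * I) / n
# Right-hand side: the two terms of the factorization.
term_D = -np.sum(np.log(n * d)) / n
term_E = -log_abs_det(E / np.sqrt(n) - z * np.diag(1.0 / (n * d))) / n
rhs = term_D + term_E
print(lhs, rhs, term_D)
```

The two sides should agree up to rounding, and `term_D` should be close to zero.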

By the circular law theorem for non-central random matrices [Cha07] and the lower
envelope theorem [ST97], almost surely, for quasi-every* $z \in \mathbb{C}$, the quantity

\[ \liminf_{n \to \infty} -\frac{1}{n} \log\left|\det\left(n^{-1/2} E - zI\right)\right| \]

is equal to the logarithmic potential at point $z$ of the uniform distribution $C_1$ on the
unit disc $\{z \in \mathbb{C};\ |z| \le 1\}$. It is thus enough to show that almost surely, for every
$z \in \mathbb{C}$,

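\[ \frac{1}{n} \log\left|\det\left(n^{-1/2} E - z (nD)^{-1}\right)\right| - \frac{1}{n} \log\left|\det\left(n^{-1/2} E - zI\right)\right| \xrightarrow[n \to \infty]{} 0. \]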



*This means "except on a subset of zero capacity", in the sense of potential theory, see [ST97].
Unfortunately, we do not know how to prove that. A possible alternative beyond potential
theoretic tools is to adapt the method developed in [TVK08] by Tao and Vu involving
a "replacement principle". The reader may find some progress in [Cha08].
Figure 1: Plot of the spectrum of a single realization of $\sqrt{n} M$ where $M \sim \mathcal{U}(\mathcal{M}_n)$
with $n = 81$. We see one isolated eigenvalue $\lambda_1(\sqrt{n} M) = \sqrt{n} = 9$ while the rest
of the spectrum remains near the unit disc and seems uniformly distributed, in
accordance with conjecture 1.6.
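An experiment of the kind shown in figure 1 can be sketched as follows. This is not the code used for the figure: the seed and the use of NumPy's Dirichlet sampler are assumptions, and no plot is drawn. Only the isolated eigenvalue and the largest modulus among the remaining eigenvalues are computed.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 81  # same dimension as in figure 1

# Rows are i.i.d. Dirichlet D_n(1, ..., 1), i.e. M ~ U(M_n).
M = rng.dirichlet(np.ones(n), size=n)
eig = np.linalg.eigvals(np.sqrt(n) * M)

# Sort the eigenvalues by decreasing modulus.
eig = eig[np.argsort(-np.abs(eig))]
top = eig[0]                        # should equal sqrt(n) = 9
rest_max_modulus = np.abs(eig[1])   # largest modulus in the rest of the spectrum
print(top, rest_max_modulus)
```

The first eigenvalue equals sqrt(n) because every Markov matrix has eigenvalue 1, while the other eigenvalues are expected to stay near the unit disc.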
Figure 2: Here 1000 i.i.d. realizations of $\sqrt{n} M$ were simulated, where $M \sim \mathcal{U}(\mathcal{M}_n)$
with $n = 300$. The first plot is the histogram of $|\lambda_2(\sqrt{n} M)|$, i.e. the modulus
of the sub-dominant eigenvalue $\lambda_2(\sqrt{n} M)$. The second plot is the histogram of
$|\mathrm{Phase}(\lambda_2(\sqrt{n} M))|$. Recall that the spectrum is symmetric with respect to the real
axis since the matrices are real.
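A reduced version of this experiment can be sketched as follows. It is an illustration under assumptions: 50 realizations with n = 100 are used instead of the 1000 realizations with n = 300 of the figure, to keep the run time short, and the helper name and seed are hypothetical. It collects the modulus and the absolute phase of the sub-dominant eigenvalue, which are the two quantities shown in the histograms.

```python
import numpy as np

rng = np.random.default_rng(7)
n, runs = 100, 50  # smaller than in figure 2 (assumption, for speed)

def subdominant_eigenvalue(n, rng):
    """Second largest eigenvalue in modulus of sqrt(n) M, with M ~ U(M_n)."""
    M = rng.dirichlet(np.ones(n), size=n)
    eig = np.linalg.eigvals(np.sqrt(n) * M)
    eig = eig[np.argsort(-np.abs(eig))]
    return eig[1]

lam2 = np.array([subdominant_eigenvalue(n, rng) for _ in range(runs)])
moduli = np.abs(lam2)             # data for the first histogram
phases = np.abs(np.angle(lam2))   # data for the second histogram
mean_modulus = moduli.mean()
print(mean_modulus, phases.min(), phases.max())
```

A mean modulus close to 1 would be in line with a spectral gap of M of order 1 − 1/sqrt(n), as conjectured in the abstract.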






Figure 3: Here we reused the sample used for figure 2. The graphic is a plot of
the 1000 i.i.d. realizations of the sub-dominant eigenvalue $\lambda_2(\sqrt{n} M)$. Since we deal
with real matrices, the spectrum is symmetric with respect to the real axis, and we
plotted $(\mathrm{RealPart}(\lambda_2), |\mathrm{ImaginaryPart}(\lambda_2)|)$ in the complex plane.